Efficient codebook for fast and accurate low resource ASR systems
Authors
Abstract
Speech interfaces are now widely used on mobile devices, so recognition speed and power consumption are becoming new metrics of Automatic Speech Recognition (ASR) performance. For ASR systems using continuous Hidden Markov Models (HMMs), computing the state likelihoods is one of the most time-consuming steps. We therefore propose novel multi-level Gaussian selection techniques to reduce the cost of state likelihood computation. The proposed algorithms are evaluated on a large vocabulary continuous speech recognition task.
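To make the idea of Gaussian selection concrete, here is a minimal single-level sketch in Python. It is not the multi-level method proposed in the paper, whose details are not given in this abstract. It assumes diagonal-covariance Gaussians, a small codebook of centroids, and a squared-distance threshold for building shortlists. All names (`build_shortlists`, `state_log_likelihood`, `floor`) and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def build_shortlists(codebook, means, threshold):
    """For each codeword, list the Gaussians whose mean is within the threshold."""
    return [
        [g for g, m in enumerate(means) if sq_dist(c, m) <= threshold]
        for c in codebook
    ]

def state_log_likelihood(x, state, means, variances, codebook, shortlists,
                         floor=-1e3):
    """Approximate log-likelihood of one HMM state for frame x.

    state is a list of (gaussian_index, mixture_weight) pairs. Only Gaussians
    on the shortlist of the nearest codeword are evaluated exactly; the rest
    receive a fixed floor log density (an assumption of this sketch).
    """
    nearest = min(range(len(codebook)), key=lambda k: sq_dist(x, codebook[k]))
    active = set(shortlists[nearest])
    terms = []
    for g, w in state:
        if g in active:
            terms.append(math.log(w) + log_gauss(x, means[g], variances[g]))
        else:
            terms.append(math.log(w) + floor)
    return log_sum_exp(terms)

# Toy model: three 2-D Gaussians and a two-entry codebook (illustrative values).
means = [[0.0, 0.0], [0.5, 0.0], [10.0, 10.0]]
variances = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
codebook = [[0.25, 0.0], [10.0, 10.0]]
shortlists = build_shortlists(codebook, means, threshold=1.0)

state = [(0, 0.5), (2, 0.5)]   # a state mixing Gaussians 0 and 2
frame = [0.1, -0.1]            # falls near codeword 0, so Gaussian 2 is skipped
approx = state_log_likelihood(frame, state, means, variances, codebook, shortlists)
```

In this toy setup the frame lies close to the first codeword, so the distant Gaussian is never evaluated, yet the state score is essentially unchanged because that Gaussian contributes almost nothing. The trade-off in a real system is between shortlist size (computation) and the error introduced by flooring Gaussians that are not negligible.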
Related resources
Efficient codebooks for fast and accurate low resource ASR systems
Today, speech interfaces have become widely employed in mobile devices, thus recognition speed and resource consumption are becoming new metrics of Automatic Speech Recognition (ASR) performance. For ASR systems using continuous Hidden Markov Models (HMMs), the computation of the state likelihood is one of the most time consuming parts. In this paper, we propose novel multi-level Gaussian selec...
Fast and Accurate OOV Decoder on High-Level Features
This work proposes a novel approach to the out-of-vocabulary (OOV) keyword search (KWS) task. The proposed approach is based on using high-level features from an automatic speech recognition (ASR) system, so-called phoneme posterior based (PPB) features, for decoding. These features are obtained by calculating time-dependent phoneme posterior probabilities from word lattices, followed by their smoo...
Towards large vocabulary ASR on embedded platforms
In this paper we present an overview of an automatic speech recognition system implementation in the context of embedded systems. Specific challenges presented by low resource platforms will be addressed for the basic components of an ASR decoder. Our main objective is to utilize and modify the technology developed for large vocabulary ASR to achieve efficient LVCSR on embedded systems as well.
Efficient implementation of ITU-T G.723.1 speech coder for multichannel voice transmission and storage
The dual-rate G.723.1 speech coder has been widely applied to real-time video and teleconferencing applications where reduced bandwidth and good voice quality are required. This paper presents an efficient implementation of the G.723.1 speech coder. To simplify the excitation quantization procedure, which is the most computationally demanding, we propose fast algorithms for adaptive codebook and fixed co...
Memory space reduction for hidden Markov models in low-resource speech recognition systems
Low-cost recognition systems based on hidden Markov models (HMM) for mobile speech recognizers (mobile phones, PDAs) have a limited quantity of memory and processing power. Furthermore, the resources have to be shared between several applications. In this paper memory efficient HMMs were investigated for low-cost recognition platforms. The feature parameter tying HMM and subspace distribution c...